Goal-directed clustering

Author

  • Joel D. Martin
Abstract

This paper presents DP1, an incremental clustering algorithm that accepts a description of the expected performance task (the goal of learning) and uses that description to alter its learning bias. With different goals, DP1 addresses a wide range of empirical learning tasks, from supervised to unsupervised learning. At one extreme, DP1 performs the same task as ID3, and at the other, it performs the same task as COBWEB.

A learning system's performance goals, and the way those goals interrelate, can significantly influence learning and subsequent performance. In complex domains, those that have many probabilistic relationships between variables, a learner can waste valuable time inducing relationships that are irrelevant for performance. However, if learners bias their learning according to a description of the expected performance task, they can attend primarily to relevant relationships.

The contrast between focused and unfocused learning is the traditional distinction between supervised and unsupervised learning (Duda & Hart, 1973). The supervised learner is told that one target variable will be important at performance time, so it can safely ignore irrelevant variables and the relationships between them. The unsupervised learner, on the other hand, has no such guidance and learns all predictive structure in the domain, assuming that all of it will be useful.

The predictive structure of a domain is the set of informative conditional probabilities between subsets of variables. If some subsets of variable values provide probabilistic information about the values of other variables, the domain has predictive structure. If there is no predictive structure, the domain is random.

In this paper, we present an integrated algorithm that smoothly varies its learning bias depending directly upon a description of the anticipated performance tasks. This description is represented as a distribution of prediction tests.
In a prediction test, the learner encounters some variable values and must predict the value of another variable. By specifying the probability that a variable will be available and the probability that it will be tested, DP1 can be made to address the same task as ID3 (Quinlan, 1986), COBWEB (Fisher, 1987), or Anderson and Matessa's (1992) method (Bc).¹

1 Expected distribution of prediction tests

A performance task is supervised if a particular variable has a special status. At prediction time, the learner expects to know the values of all the variables except the special one, and expects to have to predict the value of that target variable. More specifically, the probability that the learner will know the value of a variable at prediction time is 1.0 for non-target variables and 0.0 for the target variable. These probabilities will be called availability probabilities. Conversely, the probability that the learner will have to guess the value of a variable is 0.0 for all non-target variables and 1.0 for the target variable. These probabilities will be called goal probabilities.

¹ Although Anderson and Matessa do not name their approach, for convenience we refer to it as Bc, which stands for Bayesian Categorization.

From: AAAI Technical Report SS-94-02. Compilation copyright © 1994, AAAI (www.aaai.org). All rights reserved.
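
The availability and goal probabilities defined above can be written down as a small data structure. The following Python sketch is illustrative only and is not DP1 itself: the names `TaskDescription`, `supervised_task`, and `sample_prediction_test` are hypothetical, and the rule for drawing the tested variable (proportional to goal probability among the unobserved variables) is an assumption, since the text here does not specify how a prediction test is sampled.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TaskDescription:
    """Hypothetical container for a description of the performance task."""
    availability: Dict[str, float]  # P(value is known at prediction time)
    goal: Dict[str, float]          # P(value must be predicted)


def supervised_task(variables: List[str], target: str) -> TaskDescription:
    """Encode the supervised case from the text: only `target` is tested."""
    return TaskDescription(
        availability={v: 0.0 if v == target else 1.0 for v in variables},
        goal={v: 1.0 if v == target else 0.0 for v in variables},
    )


def sample_prediction_test(
    task: TaskDescription, rng: random.Random
) -> Tuple[List[str], Optional[str]]:
    """Draw one prediction test: (observed variables, variable to predict).

    Assumption: the tested variable is drawn from the unobserved variables
    in proportion to their goal probabilities.
    """
    # Each variable is observed independently with its availability probability.
    observed = [v for v, p in task.availability.items() if rng.random() < p]
    hidden = [v for v in task.availability if v not in observed]
    weights = [task.goal[v] for v in hidden]
    if not hidden or sum(weights) == 0.0:
        return observed, None  # nothing can be tested in this draw
    return observed, rng.choices(hidden, weights=weights, k=1)[0]


task = supervised_task(["color", "size", "class"], target="class")
observed, tested = sample_prediction_test(task, random.Random(0))
print(observed, tested)
```

For the supervised description the draw is deterministic: every non-target variable is observed and the target is always the one tested. Setting the probabilities to values strictly between 0.0 and 1.0 would give less focused tasks, though which settings correspond to COBWEB or Bc is not stated in this excerpt.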

Related resources

Clustering and Community Detection in Directed Networks: A Survey

Networks (or graphs) appear as dominant structures in diverse domains, including sociology, biology, neuroscience and computer science. In most of these cases the graphs are directed, in the sense that the edges have a direction, which makes their semantics non-symmetric: the source node transmits some property to the target node but not vice versa. An interesting fea...

Economic Policies and Social Problems in Iran

Objectives: Goal-directed subsidies, by their nature, can affect all aspects of Iranian life. This scheme is not only an economic program but also affects other aspects of social life, including the welfare and economic situation of society, the labor market, income, family expenditure and consumption patterns, and social problems such as poverty and inequality, drug abuse, family probl...

Response of the Pre-oriented Goal-directed Attention to Usual and Unusual Distractors: A Preliminary Study

Introduction: In this study, we investigated the distraction power of the unusual and usual images on the attention of 20 healthy primary school children. Methods: Our study was different from previous ones in that the participants were asked to fix the initial position of their attention on a pre-defined location after being presented with unusual images as distractors. The goals were prese...

A New Approach in Strategy Formulation using Clustering Algorithm: An Instance in a Service Company

An increasingly severe and dynamic competitive environment has made strategic decision making in giant organizations more complex. Strategy formulation is one of the basic processes in achieving long-range goals. With ordinary methods, considering all factors and their significance in accomplishing individual goals is almost impossible. Here, a new approach based on a clustering method is pro...

Using Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council

Supervised clustering is a data mining technique that assigns a set of data to predefined classes by analyzing dataset attributes. It is considered as an important technique for information retrieval, management, and mining in information systems. Since customer satisfaction is the main goal of organizations in modern society, to meet the requirements, 137 call center of Tehran city council is ...

Publication date: 2002